tanh 3x faster, and <1.5 ULP vs 2.0 ULP for master #38382

Merged 3 commits into JuliaLang:master from the better-tanh branch on Nov 14, 2020


oscardssmith
Member

The speedup comes from using exp instead of expm1 for most x. For small x (below log(2)/2 or 0.5, depending on the datatype), a minimax polynomial is used instead. This solution is faster, more accurate, and simpler. I believe the error is strictly below 1.5 ULP for all inputs, but I wouldn't be shocked if a 1.6 ULP case slipped in somewhere. Either way, the current behavior has an error of over 2 ULP at 0.233002233002233 and 1.95 ULP at 0.46070245f0, so this is a major accuracy improvement in addition to the speed improvement.
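For readers who want to see the shape of the approach, here is a minimal Python sketch of the two-branch structure described above. It is not the code from this PR: the function name `tanh_sketch`, the saturation cutoff of 22, and the coefficient table are all assumptions made for illustration. In particular, a truncated Taylor polynomial stands in for the minimax polynomial, so the small-x branch here is far less accurate than the 1.5 ULP discussed in the PR.

```python
import math

# Cutoff below which a polynomial is used; log(2)/2 is one of the two
# thresholds mentioned in the PR description.
SMALL_CUTOFF = math.log(2) / 2

# Assumption: beyond this magnitude tanh(x) rounds to +/-1 in double precision,
# and returning early also avoids overflow in exp(2x).
SATURATION = 22.0

# Taylor coefficients of tanh(x)/x in powers of x^2. This is a stand-in for
# the minimax polynomial, whose coefficients are not reproduced here.
_TAYLOR = (1.0, -1 / 3, 2 / 15, -17 / 315, 62 / 2835, -1382 / 155925)


def tanh_sketch(x):
    """Illustrative tanh: polynomial for small |x|, exp-based formula otherwise."""
    ax = abs(x)
    if ax < SMALL_CUTOFF:
        # Evaluate the odd polynomial x * P(x^2) with Horner's scheme.
        x2 = x * x
        p = 0.0
        for c in reversed(_TAYLOR):
            p = p * x2 + c
        return x * p
    if ax >= SATURATION:
        return math.copysign(1.0, x)
    # tanh(|x|) = 1 - 2 / (exp(2|x|) + 1), using exp rather than expm1.
    t = 1.0 - 2.0 / (math.exp(2.0 * ax) + 1.0)
    return math.copysign(t, x)
```

Working on |x| and restoring the sign afterwards keeps the result odd-symmetric. The exp-based formula is only used away from zero, where subtracting from 1 does not cancel badly, which is why small inputs get a separate polynomial branch.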

@dkarrasch added the maths (Mathematical functions) and performance (Must go faster) labels on Nov 11, 2020
@stevengj
Member

stevengj commented Nov 11, 2020

(It looks like we already have test coverage for this.)

@ViralBShah
Member

Good to also have @simonbyrne take a quick look if possible.

@oscardssmith
Member Author

One specific question about this. At this point, I'm pretty sure we aren't using ldexp_exp for anything anymore. I also think that _ldexp_exp could be implemented much more efficiently by putting it in exp.jl. Is there anything to act on with respect to that here, or should I open a separate PR?

@ViralBShah
Member

If not related to this PR, then make it a separate PR so that this one can go through.

@ViralBShah ViralBShah merged commit e32ae87 into JuliaLang:master Nov 14, 2020
@oscardssmith oscardssmith deleted the better-tanh branch January 26, 2021 04:09
5 participants